Text File | 1994-02-23 | 129KB | 2,893 lines

TexaSoft's

USING KWIKSTAT 3.3
(Condensed Manual)

(C)Copyright 1991,1992 Alan C. Elliott

Winner of the 1992 SIA Award for "Best Math or Engineering Program"

For additional information on this product, contact TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75104, (214) 291-2115, Fax: (214) 291-3400, Compuserve: 70721,3145.

ACKNOWLEDGEMENTS

This manual was written by Alan Elliott and Marcia Stoesz.

ALL RIGHTS RESERVED

Note: There is important information in the file LATENEWS.DOC. To view this file, enter the command TXVIEW LATENEWS.DOC. The file is on disk 3 in the 5.25 inch version.

NOTE: TO PRINT AN ORDER FORM, CHOOSE THE "ABOUT" OPTION IN THE HELP MENU. YOU WILL BE GIVEN THE OPTION TO PRINT AN ORDER FORM. OR, PRINT THE FILE NAMED KSORDER.TXT.

--------------------------------------------------------------------
Become a Registered User, Print Order Form                         1
KWIKSTAT CONDENSED MANUAL                                Version 3.3
--------------------------------------------------------------------

PART I AN OVERVIEW OF KWIKSTAT

KWIKSTAT is a statistical data analysis program. It was designed by professional statistical consultants and researchers to allow you to quickly and easily use the most commonly needed statistical data analysis procedures.

WHY USE KWIKSTAT?

KWIKSTAT can help you:

1. decide the appropriate data analysis procedure to use,
2. enter data or use data already in popular formats such as dBASE, 1-2-3 or ASCII,
3. get a complete analysis in one pass, so you do not have to run multiple programs to perform a single analysis, and
4. interpret the results, so you can make decisions based on the outcomes of the analyses.

KWIKSTAT REQUIREMENTS

KWIKSTAT is designed to run on IBM PC and 100% compatible computers, including the IBM PS/2 computers. It requires PC-DOS or MS-DOS version 3.0 or higher. Your computer should have at least 364K of free RAM. KWIKSTAT graphics require a CGA, EGA, VGA or Hercules compatible monitor. Many printers are supported.
A mouse is optional.

INSTALLATION

To install on a hard disk, place the KWIKSTAT distribution disk in the A: drive, enter the command A:INSTALL, and follow the instructions on the screen. To use KWIKSTAT on a two-disk 360K machine, place disk three (3) in the A: drive and type the command FTIPS.

NOTE: This manual is a condensed version of the printed, illustrated manual you receive when you register. Please ignore references to figures.

USING THE KWIKSTAT MENU

Once you have installed and set up KWIKSTAT, start the program by entering the KS command at the DOS prompt. The main menu contains three options: Data, Analyze and Helps. Using the right and left arrow keys on the cursor pad, you can move the menu selection to one of the other two menu bar options. Pressing the right arrow key once moves the menu bar option from Data to Analyze. The Data pull-down menu vanishes and the Analyze pull-down menu appears. Pressing the left arrow key moves the selection back to the Data menu. Or, point to a menu option with the mouse and click.

To select options from an extended (pulled-down) menu, use the up and down arrow keys on the cursor pad to highlight the option you desire, then press the Enter key. Or, press the first letter of the option name. If you are using a mouse, point to the selection with the mouse pointer and click.

USING THE ANALYZE MENU

The KWIKSTAT Analyze menu allows you to choose which analysis module to run. Usually, you first open a database in the Data menu, then choose one of the options in the Analyze menu.
For example, to calculate descriptive statistics on information in the EXAMPLE database, first open the EXAMPLE database by choosing Open a database to use in the Data menu and selecting EXAMPLE from the database list. Then press the right arrow key once to open the Analyze pull-down menu, and choose the Descriptive Statistics and Graphs option. This begins the Descriptive Statistics and Graphics module, which contains a menu of procedure options.

USING THE KWIKSTAT HELP SYSTEM

The KWIKSTAT Help system contains two levels of help. From the "Helps" pull-down menu, you can choose the Help On Using The Program option or the Decide What Analysis To Use option. The other options on this menu are:

About Kwikstat, which gives copyright information about the program and allows you to print an order form.

GO to DOS, Return with Exit (Shell), which allows you to temporarily return to the DOS prompt.

Change Setup, which allows you to set up KWIKSTAT for your computer.

TUTORIAL: TRY THIS EXAMPLE

This short tutorial will give you a feeling for how to use KWIKSTAT. It is not intended to be thorough, but simply to lead you through a common procedure. It assumes you are using KWIKSTAT on a hard disk.

To begin KWIKSTAT, you must first be in the \KWIKSTAT directory on your hard disk. Use the CD (Change Directory) command from the DOS prompt to change to the \KWIKSTAT directory:

    CD\KWIKSTAT

Once in the \KWIKSTAT directory, begin KWIKSTAT with the KS command:

    KS

The Data menu that should appear is the same as was illustrated previously in Figure 1.1. (If the Analyze menu appears, press the left arrow key once to open the Data menu.)
The Data pull-down menu is extended, as described above in the section "Using the KWIKSTAT Menu".

ACCESSING THE KWIKSTAT HELP SCREENS

To examine the KWIKSTAT HELP menu, press the F1 function key. You can think of the HELP procedure as a book, with screens instead of pages. It is really a condensation of the manual. The KWIKSTAT Program Help screen menu was illustrated in Figure 1.3. To look at a particular topic, enter the screen number you desire. For example, look at screen 7. Type the number 7, and press Enter:

    Enter SCREEN NUMBER or Enter to Cancel:7      (You enter the 7.)

If you enter a 7, KWIKSTAT displays screen 7. (If you press Enter without first typing a number, the Help is canceled, and you will return to the menu.) Once you have displayed screen 7, press Enter to move to screen number 8. To go back to the menu, press the "M" key. To exit the HELP module, press the Enter key from the main Help menu or the Esc key from a help screen. Press Enter now. This takes you back to the KWIKSTAT Data pull-down menu. Every module has the help screens available.

The KWIKSTAT "Decision" help screen is available from the Helps pull-down menu. To decide what kind of descriptive analysis to use for a single variable, choose help screen option 1 by typing a 1, then press Enter. A screen describing descriptive analysis options is displayed (as in Figure 1.5). For example, if your data is near normal (quantitative), use the B (detailed statistics), C (summary statistics) and/or E (histogram) options in the Statistics module. Pressing Esc will end the help system and return you to the main menu.

EXAMPLE OF DESCRIPTIVE STATISTICS

This example will use data already stored in a dBASE ".DBF" file named EXAMPLE, currently on the KWIKSTAT disk.
To open this database, extend the Data pull-down menu. Then select the Open a Database option. A "PICK" menu will appear. That is, a list of database names will be displayed, and you can pick one of the names from this menu. Choose the EXAMPLE database. (If the EXAMPLE database does not appear on the list of databases, you may not have installed the program correctly.) Once the database is opened, a notice at the bottom of the screen tells you that the database named EXAMPLE is open, and that it contains 50 records.

Press the "L" key once to choose the List the Contents of a database option from the menu. This will list the contents of the EXAMPLE database to the screen. Press Enter several times to list the entire database. When the list is finished, you will return to the Data pull-down menu.

Extend the Analyze pull-down menu. Choose the Descriptive Statistics and Graphs option from the Analyze menu. KWIKSTAT now switches to the Descriptive module (which may take a few seconds). Soon, you will see the Descriptive Statistics menu, as illustrated in Figure 1.6.

From the Descriptive Statistics menu, choose Detailed statistics on a single variable. The program now displays the variables available for analysis from the database. Choose variable number 2, "AGE". Before the statistics for this variable are displayed, two options are presented. First, you are prompted with the question:

    Specify Confidence Interval level (.5 to .99) (Default is .95)

For this example, press Enter to accept the default, which tells the program to display a 95% confidence interval for the data on the statistics screen. Next, a second option appears with the prompt:

    Default for percentiles is Tukey 5 Number Summary
    Specify your own percentiles to calculate <yes> <No>

When a Yes/No question appears on the screen, notice that either the Y or the N is uppercase. The uppercase option is the default: if you press Enter without typing a Y or an N, that option is chosen (No in this case).
For this example, to choose No to the question, just press Enter. Or point to No with the mouse and click. The program will now perform a series of calculations on the data, and will produce a screen of descriptive statistics and a box plot of the data. The results are illustrated in Figure 1.8.

Notice that this screen is different from previous screens. The information on this screen is displayed in graphics mode (if you have a graphics monitor). Normally, information on the screen is in "text" mode. If you are using a color monitor, a text mode screen will display text in the colors you selected in the setup procedure. When graphs are displayed on the screen, the program must use a graphics screen mode. In graphics mode, some graphs appear only in black and white, although some graphs will appear in color.

On all graphic screens in KWIKSTAT, a menu will appear at the bottom of the screen for a few seconds, then disappear. This allows you to capture or print the screen without the menu appearing on your printout. To bring the menu back, press the spacebar once. The menu options are still available even when the menu is not visible. The menus differ according to your setup and the particular options available for the graphic display, but most graphic menus will include the following options:

    Esc:Exit   R:Replot   P:Print

That is, press Esc to end the display, press R to replot (choose other display options) and press P to print the graphic screen to the printer. Depending on your monitor setting, the menu may also contain a "Capture PCX" option. This option allows you to capture the graphic screen into a PCX type file that you can then use in other programs such as WordPerfect or Pagemaker.
If you want a printed copy of this graphics screen, MAKE SURE YOUR PRINTER IS TURNED ON, is ON LINE, and HAS PAPER. Then, press "P" (for Print).

To return to the main Descriptives menu, press the Esc key. To end this module and return to the main KWIKSTAT menu, press Esc. To end KWIKSTAT from the main menu, press Esc again and answer Y to the prompt "End KWIKSTAT."

This ends the tutorial. All of the procedures are explained more fully later in the manual. However, you may find that after finishing this tutorial, you will be able to use most of the KWIKSTAT features without any further aid from the manual.

PART II USING THE KWIKSTAT DATABASE

The KWIKSTAT DATA pull-down menu in the main KWIKSTAT module is used to manage your data. From this menu you enter data, change data, create new data fields from existing ones, and perform other data maintenance tasks. Once your data is in the KWIKSTAT (dBASE-type) database, you can access the data from any of the other KWIKSTAT modules.

Some of the KWIKSTAT procedures require that you have data in a file before you can do an analysis. Other procedures allow you to enter information from the keyboard at the time you request the procedure. Some procedures give you the option of reading data from a database or entering it from the keyboard.

HOW DATA IS STORED IN KWIKSTAT

A KWIKSTAT database uses the same file format as the dBASE III and dBASE IV programs. Therefore, data already stored in a dBASE III or dBASE IV file may be read directly into all the KWIKSTAT programs. The only exception is that KWIKSTAT does not read dBASE MEMO fields. Therefore, if your data in dBASE contains memo fields, you may have to create a subset of your database before using it in KWIKSTAT.
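As a rough illustration of checking a dBASE file for memo fields before opening it in KWIKSTAT, the sketch below reads the field list from a .DBF header. It is not part of KWIKSTAT. It assumes the commonly documented dBASE III header layout (a 32-byte file header followed by 32-byte field descriptors ending with a 0x0D byte), and the function names read_dbf_header and memo_fields are hypothetical.

```python
import struct


def read_dbf_header(data: bytes):
    """Parse record count, record length and field list from .DBF bytes.

    Assumes the commonly documented dBASE III layout; this is a sketch,
    not the reader KWIKSTAT itself uses.
    """
    # Bytes 4-7: number of records; 8-9: header length; 10-11: record length
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    fields = []
    offset = 32  # field descriptors follow the 32-byte file header
    while offset < header_len and data[offset] != 0x0D:  # 0x0D ends the list
        name = data[offset:offset + 11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(data[offset + 11])   # N, C, L, D or M (memo)
        width = data[offset + 16]
        decimals = data[offset + 17]
        fields.append((name, ftype, width, decimals))
        offset += 32
    return n_records, record_len, fields


def memo_fields(fields):
    """Return the names of memo (type M) fields, which KWIKSTAT does not read."""
    return [name for name, ftype, _, _ in fields if ftype == "M"]
```

To use it on a file, read the bytes with open(path, "rb").read() and pass them to read_dbf_header. If memo_fields returns any names, create a subset of the database without those fields first.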
Data from other programs can also be used in KWIKSTAT. Refer to the section called "Entering Data into the Database."

The menu for the data options appears as the DATA option on the main KWIKSTAT menu. The following information describes how to use the options in this menu to create, manipulate and modify a database for use in KWIKSTAT.

The Open a database to use option on the DATA menu allows you to access information in a dBASE file that you created in KWIKSTAT, in dBASE, or in any other program that creates .DBF files. Use this option to choose the database that you will be analyzing. When you choose the OPEN option on the DATA menu, a list of databases currently in the default directory will be displayed, as shown in Figure 2.2.

To select a database, use the up and down arrow keys to highlight a database name, then press Enter. Or, point to the entry with the mouse pointer and click. If the database you want to use is not in the current (default) directory, you can temporarily change the default directory by pressing the F2 function key. Once a database is open, you will see its name at the bottom left of the screen, along with the number of records in the database. You can edit, pack, modify, set missing values, subset and list the database using the other options on the DATA menu.

DESIGNING AND CREATING A DATABASE

The Create a new database option on the DATA menu is used to create a new database. The structure, or layout, of a database must be described before you enter your data. You need to give some thought to how your database will "look" so it will be in the proper format to do the analysis you desire.
In the descriptions of statistical procedures (Part IV), specific examples are given about how a database should be constructed for a particular type of analysis. Kwikstat allows you to create a new database in two ways:

1. Choose from a predefined structure, or
2. Create a customized database

Both of these options are discussed in the sections below.

USING A PREDEFINED DATABASE STRUCTURE

You can choose to create a custom database structure (which was the only choice in version 3.0 and earlier) or you can choose from a list of pre-defined databases that are designed for specific analyses. The list below contains examples of some of the pre-defined database descriptions. For example, if you need to enter data for an independent group t-test, you would choose the option called "For independent group t-test or ANOVA." The proper database structure for this analysis will be created, and then you can enter your data into the database.

DEFINE THE FIELDS IN YOUR DATABASE

When you first enter the definition mode, the blinking cursor will be in the FIELD NAME area. Enter a name and press the ENTER key. The name must begin with a letter, can contain letters, numbers and "_" (underscore), and may be up to 10 characters long. The name you choose will be displayed in all capital letters, and the cursor will move to the next area, TYPE.

In the TYPE area, you only need to enter the first character of the type (N, C, L or D, for Numeric, Character, Logical or Date), then press the ENTER key.

WIDTH is the number of characters reserved for the entry. Decimal is the number of decimal places (only for numbers). Note that the number of decimal places must be at least one less than the width.
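The naming, type and width rules described so far can be sketched as a small checker. This is only an illustration of the rules as stated in this section, not KWIKSTAT's own validation, and the name check_field is hypothetical.

```python
import re

# A field name starts with a letter and may contain letters, digits and "_",
# up to 10 characters in total (rule as stated in this manual).
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,9}")


def check_field(name, ftype, width, decimals=0):
    """Return a list of problems with one field definition (empty if none)."""
    problems = []
    if not FIELD_NAME.fullmatch(name):
        problems.append("field name must start with a letter, use only "
                        "letters, digits or _, and be at most 10 characters")
    if ftype not in ("N", "C", "L", "D"):
        problems.append("type must be N, C, L or D")
    if ftype == "N" and decimals > width - 1:
        problems.append("decimal places must be at least one less than width")
    return problems


print(check_field("SCORE", "N", 5, 1))   # a valid numeric field
print(check_field("2AGE", "N", 3))       # name starts with a digit
```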
For example, if a number has the format ###.##, the width is 6 (count the decimal point), and the number of decimal places is 2. If DATE or LOGICAL is entered as the type, the program will automatically assign a width of 8 or 1 respectively.

Unless your database is big, you might make each field one character wider than you actually need. This provides for unanticipated large numbers and facilitates data entry.

Once a complete field description is entered, the next blank field description will appear, ready for entry. To end the creation process, press Ctrl-END (^END). As long as you have not ended the procedure, you may use the cursor keys to back up and make any corrections. If you make a serious mistake, end the procedure with Esc and begin again. When you press ^End, the following message appears:

    Enter Records (data) into the database now <yes> <No>

If you want to enter data now, answer "Y" to the question. Otherwise answer "N". You can always enter the data later.

LIMITATIONS TO THE KWIKSTAT DATABASE

Maximum of 250 fields.
Maximum width of a field name is 10 characters.
Maximum width of a cell is 60 characters (15 for numbers).
Dates are always 8 characters wide and logical fields are 1 character wide.
Memo fields are not supported.

DATABASE AND ANALYSIS EXAMPLES

This section provides you with two examples of using Kwikstat. Please go over these examples before creating your own database and performing your own analysis. Following these examples will answer a number of questions you may have about how to use Kwikstat. The first example shows you how to create a custom database and calculate some simple statistics and a graph. The second example shows you how to use a pre-defined database structure to perform a t-test.
EXAMPLE 1 DESCRIPTIVE STATISTICS EXAMPLE

This example shows you how to enter data and perform some simple statistics and graphs. It will show you both the spreadsheet and database entry screens. The data that will be used is listed below. The GRADE variable is the grade received in the class, AGE is age, SEX is sex, WT is weight and SCORE is the score on a pre-test (maximum of 25 points). In database language, these variables are called fields.

          GRADE   AGE   SEX    WT   SCORE
     1      A      18    M    165    22.3
     2      B      19    M    145    22.8
     3      B      17    F    122    22.8
     4      C      18    M    196    18.5
     5      B      17    M    188    19.5
     6      B      18    F    140    23.5
     7      C      19    F    121    22.6
     8      B      20    F    112    21.0
     9      C      19    F    122    20.9
    10      A      18    M    176    22.5
    11      B      18    M    165    23.3
    12      A      19    M    135    21.8
    13      A      18    F    121    24.8
    14      C      19    M    186    16.5
    15      B      17    M    148    18.5
    16      A      18    F    140    24.5
    17      B      16    F    101    23.6
    18      A      21    F    111    20.0
    19      B      17    F    124    21.9
    20      B      18    M    176    21.5

Before performing any kind of analysis on this data, you must first enter it into a Kwikstat database. The process is:

1. Create a database
2. Enter the data
3. Perform an analysis

These processes will be explained in the next few sections.

CREATE A DATABASE

When you begin the Kwikstat program, the main Kwikstat menu appears. If you have not yet created the database, you must choose the Create a new database option, which will lead you through the steps in creating a database. This section describes that procedure.

Note: Once a database has been created, you can use the data in it again by choosing "Open a database to use" from the Data menu.
When you choose Create a new database from the Data menu, you will be prompted to enter the name of the database, as shown in figure 2.6. You need to enter a name for the database that is a DOS compatible file name, such as MYDATA.

Once you have entered a filename for the database, you can choose from a list of pre-defined database structures, or create your own. In this example, you will create your own database structure. From the menu shown in figure 2.7, choose the CREATE A CUSTOMIZED DATABASE option.

For each field (each item of data) in the database, you must specify a field name, a type, a width and, optionally, the number of decimal places. For the data in this example, you will use the following information:

    Field name   Type   Width   Dec
    GRADE         C       2
    AGE           N       3
    SEX           C       2
    WT            N       4
    SCORE         N       5      1

The GRADE and SEX variables are of type "C" (Character) and the rest of the variables are numbers, type "N". Notice that the widths defined here are 1 character wider than actually needed. If you are not pressed for space in the database, this will make your listings easier to read. Only the SCORE variable requires a decimal value. Enter the information about the database structure into the database definition screen (see the section "Define the Fields in Your Database" above) as shown in figure 2.9. When you end the definition with Ctrl-End, you will be prompted with the question:

    Enter records (data) into the database now?

Type a Y to begin entering records into the database.

ENTER THE DATA

When you choose to enter the data in a new database, an entry screen will appear listing the names of all of the fields and an area to enter the data. Kwikstat includes two types of data entry screen, database type and spreadsheet type. In the Setup routine, you chose one of these two entry options.
The following discussion shows you how to enter data in either screen.

TIP: You can toggle between spreadsheet entry mode and database entry mode by pressing the F8 (Switch) key.

USING A SPREADSHEET ENTRY SCREEN

The spreadsheet screen, as shown in figure 2.10, looks similar to a spreadsheet. If you prefer to use the database entry mode, skip to the section titled "Using a database entry screen."

The names of the database fields (Grade, Age, etc.) are listed at the top of the screen (columns) and the record numbers are listed down the left side of the screen (rows). Since you do not have any records entered into the database, the only row displayed is the -ADD- row, which indicates that you are adding a new record.

To enter data into the database, begin typing the entry for the first field (GRADE). Type an A (upper case), then press Enter. Your cursor moves to the next field (AGE). Type 18 and press Enter. Type upper case M and press Enter. Continue until you have entered 22.3 in the SCORE field. When you press Enter after entering 22.3, a new row appears to allow you to enter the second record of information, and your cursor moves to the first field of this record.

Continue entering information in the spreadsheet until all records are entered. If you make a mistake on a record, you can use the right or left arrow keys to move your cursor and correct the mistake. If you discover that you have made an error in a previous record, you can use the Edit mode (described later) to correct this entry. When you have finished entering the information in the database, your screen will look like figure 2.11.

To end the entry procedure, press the F7 (Exit) key. A message will appear on the screen:

    Before exiting, do you want to save the record number 21 <yes> <No>

Answer No to this question since you do not want to have a blank record in your database.
CORRECTING ERRORS IN THE DATABASE

Before returning to the main menu, you can correct errors by pressing the F2 key to toggle into Edit mode. The edit screen is similar to the screen used to enter data. Use the cursor keys to move to the field to edit, and change the value. Exit the edit screen with the F7 (Exit) command.

If you end up with an extra record in your database, you can erase that record while in the Edit mode. To erase a record, place your cursor on the record and press F4 (Erase). The record will be permanently removed from the database. Exit the edit screen with the F7 (Exit) command.

USING A DATABASE ENTRY SCREEN

This section describes how to enter data using the database entry mode. When you begin entering data into a new database, an entry screen for record 1 appears, similar to figure 2.12 (which shows the information for record 1 already entered).

The database entry screen displays each field name at the left of the screen, followed by an entry field where you will enter the data for that field. When the entry screen first appears, your cursor will be in the GRADE field. To enter the information for record 1, type the grade value for the first record, A (upper case), and press Enter. Your cursor will move to the next field. Type 18 and press Enter. Continue until you have entered 22.3 in the SCORE field. When you press Enter after entering 22.3, a new entry screen appears for record 2 to allow you to enter the second record of information, and your cursor moves to the first field of this record.

Continue entering information until all records are entered. If you make a mistake on a record, you can use the arrow keys to move your cursor and correct the mistake.
If you discover that you have made an error in a previous record, you can use the Edit mode (described later) to correct this entry.

ENTRY AND EDIT SCREEN FUNCTION KEY COMMANDS

When you are adding information to the database, there are several function key options that you can choose. These options are listed at the bottom of the entry screen. To choose an option, press the function key related to the option, or point to the option with the mouse and click.

F1 - Displays the Kwikstat Help menu.
F2 - Toggles between edit mode and append mode.
F3 - Marks a record for deletion. (Same as ^U.) Also undeletes records.
F4 - Erases the current record permanently from the database. (Only in spreadsheet entry mode.)
F5 - Goes to a record number.
F6 - Undo. Returns the last record changed to its previous values.
F7 - Exits entry mode and returns you to the main Kwikstat menu.
F8 - Switches between spreadsheet type entry and database entry mode.
F9 - Inserts or deletes a field in the database, or replaces the contents of a field.
F10 - Prints the contents of the current record to a printer or file.

PERFORMING AN ANALYSIS

Once you have entered your data into the database, you are ready to perform one or more analyses. Exit the data entry mode by pressing the F7 (Exit) key. You will return to the Kwikstat main menu.

All of the Kwikstat analysis procedures are listed in the Analyze menu. With the Kwikstat main menu displayed, you can press the right or left arrow key to pull down the Analyze menu. For the MYDATA database you have just created, you will calculate some summary statistics and display a graph. The sections below lead you through these procedures.
CALCULATING SUMMARY STATISTICS

From the Analyze menu on the main Kwikstat menu screen, choose the Descriptive Statistics and Graphs option. A new menu will appear containing the options for the Descriptive Statistics and Graphs program module.

The Descriptive Statistics and Graphs menu lists the statistics and graphs you can produce from information in the current database. For example, suppose you want to calculate summary statistics for all of the numeric variables in your database. To do this, select the option called Summary Statistics on a number of variables. A screen will appear prompting you to specify what fields to use in the calculations.

A list of the variables in the database appears. Enter the field number of each variable you want included in the analysis, separating the numbers with commas. For example, to choose variables 2 (AGE), 4 (WT) and 5 (SCORE), enter 2,4,5 at the Enter: prompt, as shown in figure 2.15, and press Enter. Once you select the variables to use, you are asked if there is a grouping variable:

    Enter a grouping variable by number or Enter for none:

A grouping variable allows you to calculate summary statistics by group, such as SEX. You could enter a 3 at this prompt to specify that you want the summary statistics broken down by SEX, but for this example, simply press Enter to specify no grouping variable. The next prompt asks how you want to proceed:

    Choose (C)ontinue (F)ile (P)rint or (R)eturn to menu:

To continue with the calculation and display the results on the screen, just press Enter. When you do, the summary statistics for the selected variables will appear on the screen.
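For readers who want to cross-check the figures outside KWIKSTAT, the sketch below computes a few summary statistics for the AGE, WT and SCORE fields of the Example 1 data, with an optional breakdown by a grouping variable such as SEX. The choice of statistics (count, mean, sample standard deviation, minimum, maximum) and the function names are assumptions made for illustration. The KWIKSTAT screen may report a different set.

```python
from statistics import mean, stdev

# Example 1 data, copied from the table in this manual (records 1 to 20)
AGE = [18, 19, 17, 18, 17, 18, 19, 20, 19, 18,
       18, 19, 18, 19, 17, 18, 16, 21, 17, 18]
WT = [165, 145, 122, 196, 188, 140, 121, 112, 122, 176,
      165, 135, 121, 186, 148, 140, 101, 111, 124, 176]
SCORE = [22.3, 22.8, 22.8, 18.5, 19.5, 23.5, 22.6, 21.0, 20.9, 22.5,
         23.3, 21.8, 24.8, 16.5, 18.5, 24.5, 23.6, 20.0, 21.9, 21.5]
SEX = list("MMFMMFFFFMMMFMMFFFFM")


def summarize(values):
    """Count, mean, sample standard deviation, minimum and maximum."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values),
            "min": min(values), "max": max(values)}


def summarize_by_group(values, groups):
    """Apply summarize() separately to each level of a grouping variable."""
    return {g: summarize([v for v, k in zip(values, groups) if k == g])
            for g in sorted(set(groups))}


for name, column in (("AGE", AGE), ("WT", WT), ("SCORE", SCORE)):
    s = summarize(column)
    print(f"{name:6} n={s['n']:3d} mean={s['mean']:8.2f} "
          f"sd={s['sd']:7.2f} min={s['min']:6} max={s['max']:6}")
```

Calling summarize_by_group(WT, SEX) corresponds to entering 3 (SEX) at the grouping variable prompt.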
Once you have examined these results, press Enter to return to the Descriptive Statistics and Graphs menu.

EXAMPLE 2 USING A PRE-DEFINED STRUCTURE

This example shows how you would perform an independent group t-test in Kwikstat using one of the pre-defined database structures. The example uses the spreadsheet entry type. The data used in this example is the same as in example 4.7 in the manual (page 4-23 and following).

In this example, 13 plants were randomly allocated to two groups. Group 1 received the present fertilizer and group 2 received a newer fertilizer. After a period of time, the heights of the plants were observed. The results are:

    Data for independent group t-test (fertilizer study)

    Present        Newer
    Fertilizer     Fertilizer
    46.2 cm        51.3 cm
    55.6           52.4
    53.3           54.6
    44.8           52.2
    55.4           64.3
    56.0           55.0
    48.9

In order to enter this into a database, you must assign group numbers (or letters) to each group. For example, we will call the "Present Fertilizer" group 1 and the "Newer Fertilizer" group 2.

CREATING THE DATABASE AND ENTERING DATA

Since the observations are independent, the database will include thirteen records (one for each plant) and two fields (one for the response and one for the group indicator).

Choose to create a database named TTEST. A screen with the instruction "Choose the database type to create from the menu below" will appear. Since you are performing an independent group t-test, select the option titled For Independent Group t-Test or ANOVA from this list. This process automatically builds a database structure suitable for entering data for this kind of analysis.
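As an illustration of the one-record-per-plant layout, the sketch below stores the fertilizer data as (group, height) pairs and computes an equal-variance (pooled) two-sample t statistic by hand. This is a textbook calculation included for orientation only. Whether KWIKSTAT reports this version of the t-test, an unequal-variance version, or both is not stated here, and the name pooled_t is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# One record per plant: (group, height in cm).
# Group 1 = present fertilizer, group 2 = newer fertilizer.
records = [(1, 46.2), (1, 55.6), (1, 53.3), (1, 44.8),
           (1, 55.4), (1, 56.0), (1, 48.9),
           (2, 51.3), (2, 52.4), (2, 54.6),
           (2, 52.2), (2, 64.3), (2, 55.0)]


def pooled_t(records):
    """Return (t, degrees of freedom) for a pooled two-sample t-test."""
    g1 = [obs for group, obs in records if group == 1]
    g2 = [obs for group, obs in records if group == 2]
    n1, n2 = len(g1), len(g2)
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(g1) - mean(g2)) / std_err, df


t, df = pooled_t(records)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A p-value requires the t distribution, which the Python standard library does not provide, so the sketch stops at the test statistic.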
In this case, the database will contain a grouping field (where you will enter a 1 or 2, the fertilizer type) and an observation field (where you will enter the height).

Once you have selected a database type, you will be asked if you want to enter records now. Answer Yes. Enter the data into the database. The data you will enter in the first record is 1 (press Enter) and 46.2 (press Enter). When you type the 46.2 and press Enter, your cursor will automatically move to record number 2, where you will enter 1 and 55.6, and so on. Enter the data for the thirteen records. For each record of a "Present Fertilizer" observation, enter "1" for the GROUP variable. For the "Newer Fertilizer" observations, enter "2" for the GROUP variable. The eighth record is 2 and 51.3.

Figure 2.21 shows a screen where all 13 records have been entered, and the program is waiting for a 14th record to be entered. Since there is no 14th record, press the F7 function key (Exit) to end the data entry process. You will be prompted with the question, "Do you want to save the record number 14?" You do not want to save this blank record, so answer N (No). KWIKSTAT will return to the Data main menu.

PERFORMING THE ANALYSIS

Once you have entered the data into a database, and you are back at the main menu, select the Analyze option at the top of the main Kwikstat menu screen. When the Analyze pull-down menu appears, select the t-tests and Analysis of Variance (ANOVA) option. The menu shown in figure 2.23 will appear. Select Compare independent groups (t-test, ANOVA). You will be prompted to choose the field name of the group, which in this case is simply GROUP. Choose GROUP. Next, you will be asked for the data field.
Choose OBS (HEIGHT), the response variable. KWIKSTAT will now perform the calculations and display the results on the screen. Refer to the section "Using t-tests and ANOVA Procedures" in Part IV of this manual for information on interpreting this information.

When the results screen is displayed, typing G displays a graphical comparison of the two samples. First, a screen containing Tukey's five number summaries (listing the 0, 25, 50, 75, and 100 percentiles for each group) appears. Press Enter, and box plots for each group will appear as shown. Press Esc to exit from the box plots.

You will be given the option to print a report for this analysis. If you choose this option, a summary of the analysis will be printed to the printer or to a file. After you have printed the report (or chosen not to print the report), you will return to the module menu. To return to the main Kwikstat menu, press Esc. To end the program and return to DOS, choose the F option, Quit to DOS.

ENTERING DATA INTO THE DATABASE

When you choose the Data entry option from the DATA menu, you will be asked to specify entry from the keyboard or from a file (ASCII file). For most small data sets, you will probably enter data from the keyboard. If another program supports ASCII, dBASE or 1-2-3 type files, you may be able to enter data from that program into KWIKSTAT. The following information describes how to enter data from the keyboard, from an ASCII file or from other programs.

ENTERING DATA FROM THE KEYBOARD

If you choose KEYBOARD data entry, an entry screen will appear containing the fields you created in the CREATE option. This entry screen will use either the database or the spreadsheet format mode, depending on which one you specified when you chose setup options.
However, you can easily toggle from one entry mode to the other by pressing the F8 key. Examples 1 and 2 above describe these two entry methods.

ENTERING DATA FROM AN ASCII FILE

When you choose to enter data from an ASCII file, you will be asked for the name of the raw data file (e.g., \MYDIR\MYDATA.DAT). The data from the ASCII file will be entered into the database, and a count of the records as they are entered will be displayed. If there are already records in the file, the new data from the ASCII file will be appended (added) as new records to the database. It is a good idea to go to the List procedure to look at the data to verify that it has been entered correctly, or print the data out using the report option (see Using KWIKSTAT Utilities). If the data does not match the fields, refigure the width of each field to make sure it matches the columns of data in the disk file, and try again.

KWIKSTAT can read data from standard ASCII text files. These kinds of files are supported by most word processing programs (such as WordPerfect DOS Text Mode) as well as most text editors such as EDLIN. Data must be in the form of column data, like this:

A 22 3.3 WF
A 33 4.2 BF
B 27 3.3 WM
: ETC

Notice that each column of data is in a fixed field. It does not matter that there is no space between the last two fields (Race and Sex), since the program picks off the information by column position and does not require spaces between the columns. Use the instructions below to prepare the KWIKSTAT (dBASE) database structure to be used to read in ASCII data.

The steps to enter ASCII data into KWIKSTAT are:

STEP 1. Use the CREATE option to create a database structure to match the columns in the ASCII file.
The field widths MUST match the width of the columns of data on file. If there are spaces between columns of data, make widths wide enough to account for those spaces. The following data is from the file EX.DAT on disk:

A   12 22.3 25.3 28.2 30.6 5
A   11 22.8 27.5 33.3 35.8 5
B   12 22.8 30.0 32.8 31.0 4
A   12 18.5 26.0 29.0 27.9 5
B    9 19.5 25.0 25.3 26.6 5
: etc :
B   12 22.4 27.2 31.8 35.6 4

Try your hand at doing this example by creating a database named EX with the following structure:

FIELD NAME   TYPE   WIDTH   DECIMALS
----------   ----   -----   --------
GROUP        C      2
AGE          N      4       0
TIME1        N      5       1
TIME2        N      5       1
TIME3        N      5       1
TIME4        N      5       1
STATUS       N      2

Notice that even though the first column has data 1 column wide, this structure uses a width of 2 for GROUP. Even though the age only uses 2 columns, the structure calls for AGE to have a width of 4. These widths are entered this way to take care of the blank spaces between the columns. If GROUP had been set up with only 1 column and AGE with only 2 columns, the ASCII data would not be read into the database correctly. Create the database called EX with the specifications listed above, then go to the next step.

NOTE: KWIKSTAT can also produce an ASCII text file, so that data created in KWIKSTAT can be output and read into other programs.

STEP 3. To verify that the data was read properly, use the List option on the DATA menu to examine the resulting database.

ENTERING DATA FROM ANOTHER PROGRAM

KWIKSTAT was designed to read dBASE, comma delimited ASCII and 1-2-3 files because these are among the most commonly used types of files to store data. See information on the utility module later in this manual.
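The way fixed field widths pick values out of a line of column data can be sketched in Python. This is only an illustration, not KWIKSTAT's own import code: the function name parse_fixed_width is hypothetical, and the sample line assumes that numbers are right-justified within the field widths of the EX structure.

```python
# Field layout taken from the EX structure: (name, type, width).
# Type "C" is character, "N" is numeric.
EX_STRUCTURE = [
    ("GROUP", "C", 2),
    ("AGE", "N", 4),
    ("TIME1", "N", 5),
    ("TIME2", "N", 5),
    ("TIME3", "N", 5),
    ("TIME4", "N", 5),
    ("STATUS", "N", 2),
]


def parse_fixed_width(line, structure):
    """Slice one line of column data into a dict keyed by field name."""
    record = {}
    start = 0
    for name, field_type, width in structure:
        raw = line[start:start + width].strip()
        if field_type == "N":
            # A blank numeric field is reported as None in this sketch.
            record[name] = float(raw) if raw else None
        else:
            record[name] = raw
        start += width
    return record


# Assumed spacing: each number is right-justified inside its field.
sample = "A   12 22.3 25.3 28.2 30.6 5"
record = parse_fixed_width(sample, EX_STRUCTURE)
print(record)

# If GROUP were 1 wide and AGE 2 wide, the AGE slice would fall on the
# blank columns and no age would be read from this line.
narrow = parse_fixed_width(sample, [("GROUP", "C", 1), ("AGE", "N", 2)])
print(narrow)
```

The second call shows why the widths must include the blank columns: with the narrower widths, the slice for AGE contains only spaces.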
EDITING, DELETING AND PACKING DATA

Once a database is created, you often need to correct information by editing records or getting rid of records. The following sections describe the process of changing the contents of a record by editing and a procedure for getting rid of records by deleting and packing.

EDITING RECORDS

If there is a need to change data already in a database, you may choose the Edit a record option from the DATA menu. You will be asked to specify the record number you wish to edit. Editing is similar to entering data. Use the up and down arrow keys to move from field to field within a record. See the Examples above for a tutorial on editing information in a database.

DELETING RECORDS

If you want to delete an entire record within a database, use the edit procedure to display the record to delete. You can use one of two methods to delete a record:

1. Erase the record (Spreadsheet mode only), or
2. Delete and Pack

To erase a record, display the record in Edit mode, using the spreadsheet entry mode. Highlight the record to erase, and press the function key F4.

If you want to delete more than a few records, it will probably be more efficient to use the Delete and Pack method. While a record is displayed (either in spreadsheet or database edit mode), press ^U to mark the record for deletion. A **DEL** will appear on the screen (upper right corner) of a "deleted" record (database mode) or a "*" will appear next to the field name (spreadsheet mode). You can mark as many records as you choose. If you accidentally mark a record for deletion, pressing ^U a second time will cancel the mark, and the **DEL** will disappear from the screen. Once you have marked the records for deletion, pack the database, as described below.
TIP: If you want to temporarily "get rid" of a few records so that they will not be used in an analysis, mark them for deletion. Any analysis you perform will ignore deleted records. Then, if you want to restore them, unmark them again. This is a quick way to see how an analysis result would change if some selected records were not present in the analysis.

PACKING THE DATABASE

The records marked for deletion are not actually deleted from the file at this point. However, they will be ignored in most analyses, and will continue to be displayed when you edit the database. You can undelete records from the Edit mode. If you want to permanently get rid of the records you have marked for deletion, choose the Pack procedure from the Data menu. This procedure erases all "deleted" records from the database.

MODIFYING AND DISPLAYING THE STRUCTURE

The Modify or Display database structure option on the DATA menu allows you to display the structure of your database, and allows you to change characteristics of the database structure. When you choose to display the structure, all field names are listed with their types, widths and decimals (if any). When you choose to modify a field, you are given a chance to modify the characteristics of that field. Your options are:

Delete the Field
Change Name of Field
Change Type of Field
Change Width of Field
Change Number of Decimal Places

If you change the type of field, say from character to numeric, the program will attempt to convert the contents of the field to its new type. When you modify a database, you will be asked to enter the name of a new database. This means that the modified database will be in a new file, and your original database will remain intact.
If you no longer want the old database, you must delete it by choosing the Kill option from the Data menu.

SETTING MISSING VALUES CODES

Sometimes in the collection of data there are values that are lost or cannot be gathered. These are called "missing values". When such values occur, it is important for the program to know that the values are missing so that statistical calculations may take this into account.

Missing values are usually designated by an impossible value. For example, the missing value designated for the variable AGE may be -9, since it is impossible for the variable AGE to have the value -9. When the program is asked to calculate the mean of AGE, for example, it will ignore those records where AGE is -9 in that calculation if -9 has been specified as the missing value code. In most KWIKSTAT procedures, there is a casewise deletion of the record from calculation whenever a missing value is encountered.

Once you designate a missing value code for a variable, it is up to you to make sure that this code gets placed into your database in the proper records and fields. For example, if you have designated -9 as the missing value code for AGE, you must make sure that in your database a -9 appears in the field AGE if that data is missing or unknown.

A standard dBASE III file does not have a way to designate missing values, but KWIKSTAT provides a way for you to designate these values in this program. The Indicate missing value codes option on the DATA menu is used to set up these values. When this option is selected, the program will display an entry screen that is similar to a data entry screen. You may enter one missing value for each field name. The missing value must obey the definition of the field in terms of length and type.
Once missing values are entered, they are stored on disk in a file named filename.MV, where "filename" is the name of the designated database. If a new variable is created using the transformation procedure, its missing value is appended to the missing value file. You may change or correct the missing values for a database at any time by calling up this option. If missing values are already designated for the database, they will be displayed on the entry screen, and you may edit them or accept them as they are.

IMPORTANT NOTE: If missing values are NOT used, and there is a blank numeric variable in a calculation, it will be treated like the value 0 (zero), so it is important to use missing values if your data contains such entries. Otherwise, the statistical calculations will be in error!!

CREATING A NEW FIELD

In previous versions of Kwikstat, to create a new variable, you chose the Transformation option. This option only allowed the creation of numeric variables. This option has been replaced by a procedure in the Edit mode, which allows you to create new blank fields of any field type, and to place information in those fields that is either a numeric or character expression. The sections "Create a New Field" and "Replacing the Contents of a Field" describe these procedures.

You may create a new field in a database within an edit screen by choosing the F9 (FIELD Insert) option. After creating a new field, you can then use the F9 (FIELD Replace) option to place a value in the new field. When you choose the Field option in the edit screen (F9), you will be prompted to enter information about the new field.

Define a name for the new field
Define the field type
Define a width for the new field
For numeric variables, define the number of decimals, if any
Define a missing value code. If none is selected, it is assumed to be 0 (zero).
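The effect of a missing value code on a calculation, as described under SETTING MISSING VALUES CODES, can be sketched in Python. This is an illustration only, not KWIKSTAT's internal routine: the function name and the AGE values are hypothetical, and -9 is the missing value code used as an example in this manual.

```python
def mean_ignoring_missing(values, missing_code):
    """Mean of the values that are not equal to the missing value code."""
    kept = [v for v in values if v != missing_code]
    if not kept:
        return None  # nothing left to average
    return sum(kept) / len(kept)


# Hypothetical AGE field in which two ages are unknown and coded as -9.
ages = [34, 27, -9, 41, -9, 30]
mean_with_code = mean_ignoring_missing(ages, -9)

# The same field with the unknown ages left blank; a blank numeric field
# is treated as 0 when no missing value code is in use.
ages_blank_as_zero = [34, 27, 0, 41, 0, 30]
mean_blank_as_zero = sum(ages_blank_as_zero) / len(ages_blank_as_zero)

print(mean_with_code)      # 33.0
print(mean_blank_as_zero)  # 22.0
```

With the code in place, only the four known ages enter the mean. Without it, the two blanks count as zeros and pull the mean down, which is the error the IMPORTANT NOTE warns about.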
CAREFUL ATTENTION must be paid to the definition to assure that the calculated numbers will fit into the field width specifications. If the calculated number is too large to fit into the field, it will be given the missing value code. If an illegal calculation is attempted, such as a division by 0, the result will be missing. If a calculation includes a missing value, the result will be a missing value.

TIP: To create a new field containing a value that is a numeric transformation of other fields, first insert the new field using the F9 Field/Insert option, then use the F9 Field/Replace option to place the value in the new field.

REPLACING CONTENTS OF A FIELD

You can use the F9 Field/Replace option in the Edit screen to replace the existing contents of a field, or to place new information in a newly created blank field. Kwikstat provides a number of numeric and character functions to enable you to do this. For example, if you wanted to replace the contents of the field RATIO with the values WEIGHT/HEIGHT:

1) In the edit mode, highlight the field whose contents you want to replace. Press F9 (Field), and choose the Replace the contents of a field option from the Field menu. A dialog box will appear.

2) Specify which records to replace. The default is ALL, which means all records in the database. Or, enter a range such as 1-20, which would mean only perform the replacement in records number 1 through 20. Then, press Enter.

3) Specify what to place in the field. For example, enter the formula WEIGHT/HEIGHT in the Replace With entry field, where WEIGHT and HEIGHT are two other fields in the same database.

4) Specify the condition for replacing, if any. The default is NONE.
For example, if you only want the replacement to be for records whose value of AGE is greater than 20, you would enter the expression AGE>20 in the condition entry field.

5) Press F7 when you have finished entering the Replace information, and the replace will begin. When it is finished, you will return to the edit screen.

An example of the Replace dialog box is shown in figure 2.28. The kinds of expressions you can use in the Replace With and Condition fields are described below.

KWIKSTAT supports two kinds of expressions. One is strictly for mathematical expressions, called a math expression. The other expression type, called a database expression, allows the use of character, numeric, date and logical fields in the expression. Here are the criteria for when these are used:

REPLACE WITH FIELD: Use either a math expression or a database expression.

CONDITION FIELD: Use only a database expression.

In the REPLACE WITH field, the default expression type is the database type. In order for an expression to be evaluated as a strictly math expression, you must place an equal sign "=" at the beginning of the expression.

The major difference between the two expression types is in their capabilities. The database expression can handle most common calculations, including simple math, string evaluation, and date evaluation. The math expression can be used only for strictly numeric calculations that use one or more of the functions listed in the table below, or that use the exponentiation operator. For example, if you want to perform the calculation WEIGHT/HEIGHT, you can enter the expression as-is in the REPLACE WITH field.
However, if you want to calculate the log of WEIGHT/HEIGHT, you must enter the expression as =LOG(WEIGHT/HEIGHT), since the LOG function is not supported as a database expression function. The equal sign signals to the program to use the math calculator. The information below outlines the capabilities of both expression types.

Mathematical operators supported are Add +, Subtract -, Divide /, Multiply *, and Exponentiation ^ (Math calculator only). For Character fields, the database calculator supports the operation Add + (appends one string to another). Following are a few examples of correct expressions:

AGE/HEIGHT
=SCORE^2 (= signals math calculator)
LTRIM(FIRST)+' '+LAST

Note: Literal strings included in expressions must be surrounded by single quotes. For example, 'Hello' is a literal string. Character field names are used without quotes. For example, NAME is a field name. A correct string expression using these two strings would be:

'Hello '+NAME

TIP: Unless you use scientific functions in your calculations, you don't need to be concerned about which calculation type to use. Only if you use a numeric operation or function not supported by the database calculator will you need to place an equal (=) sign at the beginning of the expression.

Database calculator functions supported

The following functions may be used in expressions both in the Replace With and Condition fields.
Database Calculator Functions

ABS(NUM), ASC(STG), AT(STG1,STG2), CALENDAR(NUM), CAPS(STG), CHR(NUM), DATE(), DELETED(), IIF(LEXP,AEXP1,AEXP2), INT(NUM), JULIAN(DATE), LEFT(STG,NUM), LEN(STG), LOWER(STG), LTRIM(STG), MAX(NUM1,NUM2), MIN(NUM1,NUM2), RECNO(), REPLICATE(STG,NUM), RIGHT(STG,NUM), RTRIM(STG), SPACE(NUM), STR(NUM), STRING(NUM,NUM|STR), STUFF(STG,NUM,NUM,STG2), SUBSTR(STG,NUM,[NUM]), TIME(), TRIM(STG), UPPER(STG), VAL(STG).

Following are a few example uses of these functions:

ASC - Converts the first character of a string to its ASCII code. For example, the function ASC('A') would return the value 65, since 65 is the code for an uppercase A.

AT - Returns the starting position of one character string within another character string. For example, the expression AT('Bill', 'Wild Bill') = 6, since the string 'Bill' begins at the sixth character of the string 'Wild Bill'.

CHR - Converts an ASCII code into its character. For example, CHR(65) is equal to the character string 'A'.

DELETED - Returns a T if the current record is marked for deletion, else it returns an F. Can be used to conditionally replace a value depending on whether the record is deleted or not.

INT - Rounds down to the nearest integer. INT(3.2) would be returned as 3.

LEFT and RIGHT - Return the left or right portion of a string. For example, LEFT('Wild Bill',3) would return the string 'Wil' and RIGHT('Wild Bill',3) would return the string 'ill'.

LOWER and UPPER - Return the string in lower or upper case. For example, UPPER('Wild Bill') would return 'WILD BILL'.

LTRIM, RTRIM and TRIM - Trim blanks from the left, right or both ends of a string. For example, RTRIM('Wild Bill ') would return 'Wild Bill'.

VAL - Returns the value of a string. For example VAL('24') is the number 24.
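For readers who know Python, the example functions above have close counterparts there. The sketch below mirrors the descriptions given in this manual and is not KWIKSTAT code. The helper names asc, at, left, right and val are hypothetical, and returning 0 from at when the string is not found is an assumption. The printed results are what Python itself produces.

```python
import math


def asc(text):
    """Code of the first character of a string, like ASC."""
    return ord(text[0])


def at(needle, haystack):
    """Position of needle in haystack counting from 1; 0 if absent (assumed)."""
    return haystack.find(needle) + 1


def left(text, count):
    """Leftmost count characters, like LEFT."""
    return text[:count]


def right(text, count):
    """Rightmost count characters, like RIGHT."""
    return text[-count:] if count > 0 else ""


def val(text):
    """Numeric value of a string, like VAL."""
    return float(text)


print(asc("A"))                   # 65
print(chr(65))                    # A
print(at("Bill", "Wild Bill"))    # 6
print(math.floor(3.2))            # 3
print(left("Wild Bill", 3))       # Wil
print(right("Wild Bill", 3))      # ill
print("Wild Bill".lower())        # wild bill
print("Wild Bill".upper())        # WILD BILL
print("  Wild Bill  ".strip())    # Wild Bill
print(val("24"))                  # 24.0
```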
Most of these functions are similar or identical to functions used in the BASIC language, in dBASE or in other database programs. For more examples, you might refer to documentation on these programs.

MATH EXPRESSIONS

The following functions are supported only in the Replace With entry field, and only for numeric field types. You MUST precede expressions using these functions with an = sign. An example of the RECODE function, which appears in the following table, is:

=RECODE(SCORE,1,AGE,10,15)

The five arguments in the RECODE function are:

No.   Example   Meaning
1     SCORE     Field to use in compare
2     1         Value to assign if comparison is true
3     AGE       Value to assign if comparison is false
4     10        Low range of field to compare
5     15        High range of field to compare

Thus, this example means that the value of the RECODE is 1 if SCORE is between 10 and 15, else the value is the current value of the AGE field for that record.

Math Calculator Functions

ABS(NUM), ASIN(NUM), ATAN(NUM), ATAN2(y,x), CSC(NUM), COS(NUM), COT(NUM), EXP(NUM), INT(NUM), LN(NUM), LOG(NUM), MAX(T1,T2,T3), MIN(T1,T2,T3), MOD(NUM1,NUM2), PI, RAND, RECNO, RECODE(NUM1,NUM2,NUM3,NUM4,NUM5,NUM6), ROUND(NUM,DEC), SEC(NUM), SIN(NUM), SQRT(NUM), SUM(NUM1,NUM2...), TAN(NUM).

SUBSETTING THE DATABASE

The Subset database option on the DATA menu allows you to create a new database from an old database. The new database can be a subset of the old one, using a conditional criterion for outputting information from the old database to the new one. For example, suppose you have a database with a field GROUP with values 1, 2, 3, 4 and 5. You want to create a database that does NOT include Group 5. After choosing Subset database from the DATA menu, you are asked for the name of the new database.
For example, your new database might be named NO5.DBF. You are asked for the field name to be used in the selection criteria. In this case, you would choose the field named GROUP. Next you must enter the selection relationship. It will be described as a numerical expression. The conditional operators you may use are:

= > < >= <= <>

and the logical operators .NOT., .AND., and .OR. It is important that a dot (.) appear before and after each logical operator. For example, you might enter a condition such as

AGE <10 .OR. SEX='M'

When you choose the Subset option from the Data menu, a Subset dialog box appears on the screen. There are two items you must enter in the Subset dialog box. First is a name for the new database. This must not be the same name as the current database. Then, you must enter the subset criteria. Examples of subsetting criteria are:

GROUP = 4
GROUP > STATUS
GROUP < WEIGHT*HEIGHT
TIME1 = TIME2*1.96
SEX = 'F'
TIME1 <=20 .AND. SEX = 'M'

When creating these expressions, you can use the same functions as were previously described in the table "Functions Supported for Character, Date and Value Expressions."

LISTING THE DATABASE TO THE SCREEN

The LIST option on the DATA menu allows you to look at the information in your database. The list produces an on-screen report that lists the data one record at a time. If your database contains too many fields to be displayed on the screen at one time, the list procedure will ask you at which field to begin the display.

ZAP A DATABASE

The Zap option allows you to quickly erase all records from a database. To use this option, open a database, then choose Zap.

KILL - DELETE A DATABASE

The Kill option allows you to delete a database and its related missing values files (if any).
When you choose this option, a list of databases will appear on the screen. Choose the database to delete, and the file(s) will be erased from your disk.

QUIT/EXIT KWIKSTAT

Use this option to end the Kwikstat program and return to DOS.

PART IV PERFORMING A STATISTICAL ANALYSIS

This section of the KWIKSTAT manual describes the statistical analysis procedures available in the basic KWIKSTAT program.

USING DESCRIPTIVE STATISTICS AND GRAPHS

The Descriptive Statistics and Graphs module allows you to examine summary statistics of the data in a database. Graphics are used throughout KWIKSTAT to provide visual displays of the data.

DETAILED STATISTICS ON A SINGLE VARIABLE

This option calculates the mean, standard deviation, median, standard error of the mean, minimum, maximum, sum, and variance of a set of data. In this option, KWIKSTAT also calculates five percentiles and computes a two-sided confidence interval about the mean. If you do not specify otherwise, the default percentiles (Tukey's five number summary: 0, 25th, 50th, 75th, 100th percentiles) and default level of confidence (95%) are used. If the Tukey five number summary is used, a box plot is also displayed. For sample sizes less than or equal to 30, a t-statistic is used to calculate the confidence interval. For sample sizes over 30, the 95% (two-sided) z-statistic (1.96) is used.

SUMMARY STATISTICS ON A NUMBER OF VARIABLES

This option is similar to the Detailed statistics on a single variable option above, but in this option several variables can be summarized using descriptive statistics (sample size, mean, standard deviation, minimum, maximum, and standard error of the mean). If you have a grouping variable in your database, you may request output of summary statistics by group.
You are also given the opportunity to print results to the printer, or to output results to a file.

APPROXIMATE P-VALUE DETERMINATION

This option calculates p-values for entered values of four test statistics: normal (z), Student's t, F, and chi-square. If you designate the statistic being used, the degrees of freedom and the calculated value of the test statistic, KWIKSTAT will tell you the p-value associated with that test statistic.

PRODUCING A HISTOGRAM

This procedure produces a histogram from values read from a database. A histogram can be helpful in determining if the distribution of a continuous variable is approximated by a normal distribution. If the histogram has a peak toward the center, with both tails diminishing, the data could be considered to be approximated by a normal distribution.

PRODUCING AN XY-PLOT (SCATTERPLOT)

This option enables you to produce a scatterplot of two variables. A scatterplot is simply a plot of all the data values, with one variable plotted against the other. Such a plot is helpful in determining if two variables are related, and if the relationship is linear (a straight line), curvilinear, or something else. This information is important for regression and correlation. (Scatterplots can also be produced from the Regression & Correlation module.)

EXAMPLE 4.5: SCATTERPLOT

This example uses the EXAMPLE database file on the KWIKSTAT disk. Suppose you want to create and display a scatterplot of the TIME1 variable against TIME2. First, you must retrieve the database:

RETRIEVING THE DATABASE

In the lower left corner of the screen you can see the name of the database currently in use. If it's the one you want, EXAMPLE in this case, go on to Performing the Analysis.
If another database is in use, see Example 4.2 for detailed instructions on retrieving EXAMPLE.

PERFORMING THE ANALYSIS

From the Descriptive Statistics and Graphs menu, select XY-Plot (Scatterplot). You will be prompted to enter two fields, or variables, to use. Since you want to do a scatterplot of TIME1 against TIME2, enter 3,4. You will then be prompted for specifications for the plot. You may use the default settings (by simply pressing Enter at each prompt) or you may set your own. (The default setting is no grid lines and points not connected.)

KWIKSTAT will draw the scatterplot according to the specifications. The disappearing menu at the bottom of the screen gives you the option to (P)rint. (Press Enter to make the bottom menu reappear.) You can use the (R)eplot option to experiment with the specifications of the scatterplot. Pressing Esc takes you back to the Choose Analysis Option menu.

Figure 4.4 shows the scatterplot of TIME1 and TIME2 of the EXAMPLE database using default specifications. This plot suggests that the two variables, TIME1 and TIME2, are positively related. As TIME1 increases, so does TIME2.

TIME SERIES PLOT

This option enables you to produce a time-series plot for one variable. This plot is useful in examining data that is time related, such as profit by month, etc. The X axis is assumed to be "time". The data values must be entered into records in the chronological order in which the observations occurred, i.e., the first record must contain the results of the first observation (first time period), etc.

USING T-TESTS AND ANOVA PROCEDURES

T-tests and Analysis of Variance (ANOVA) procedures are used to test hypotheses about population means using data obtained through random sampling of those populations.
For example, if you do an experiment in which you give different treatments to different groups (e.g., different fertilizers to different groups of plants), the response measurements (e.g., plant heights) will very likely all be different and the average responses for the different groups will be different.

PARAMETRIC INDEPENDENT GROUP ANALYSIS

Independent group analysis is appropriate when observations are taken from groups in which subjects in one group do not appear in another group. That is, the observations within as well as between groups are independent of one another. In this module, a t-test is performed when there are two groups, and an ANOVA is performed when there are three to ten groups being compared.

When performing a t-test or ANOVA on two or more independent groups, you are testing the hypotheses:

Ho: The difference in the means of the groups is zero.
Ha: The difference in the means of the groups is not zero.

For a two-sample t-test, two t-statistics are calculated, one for the case in which the variances of the two samples are equal and the other for use in the case of unequal variances. KWIKSTAT performs a test of the hypothesis that the variances are equal. If the p-value is small (e.g., less than 0.05), the hypothesis of equal variances is rejected and you use the t-statistic for unequal variances. If the p-value is large, use the t-statistic for equal variances.

Since the observations are all independent of one another, each observation is entered as an individual record in the database. The number of data records must be the same as the total number of observations. Each record includes the response value of one observation and a number or character to indicate to which treatment group it belongs. That is, there will be two fields (variables), one in which to record the response and one in which to indicate the group.

EXAMPLE 4.8:
SINGLE FACTOR ANOVA

When more than two independent groups are compared with respect to one variable, one-way or single factor analysis of variance techniques are appropriate. This example uses data for hogs which have been randomly assigned to four groups, with each group being given a different feed. The response is weight gain.

Data for Independent Group ANOVA

Gp 1    Gp 2    Gp 3    Gp 4
60.8    78.7    92.6    86.9
67.0    77.7    84.1    82.2
54.6    76.3    90.5    83.7
61.7    79.8            90.3

CREATING A DATABASE

The database to analyze this data is similar to the one used for Example 4.7 above, differing only with respect to the number of groups. In fact, this one-way ANOVA is an extension of the t-test when there are three or more groups.

Create a database (named, e.g., HOGFEED) with two fields: GROUP (or you may want to call this field FEED) and WEIGHT. The groups will be numbered 1,2,3,4 according to the type of feed used. When displayed with the List (display) the contents of a database option, the HOGFEED database should look like this:

RECNO   GROUP   WEIGHT
    1       1     60.8
    2       1     67.0
    3       1     54.6
    4       1     61.7
    5       2     78.7
    6       2     77.7
    7       2     76.3
    8       2     79.8
    9       3     92.6
   10       3     84.1
   11       3     90.5
   12       4     86.9
   13       4     82.2
   14       4     83.7
   15       4     90.3

PERFORMING THE ANALYSIS

Select the Analyze menu. When the Analyze pull-down menu appears, choose the t-tests and Analysis of Variance (ANOVA) option. The t-tests and Analysis of Variance Choose Analysis Option menu will appear. Select Compare independent groups (t-test, ANOVA). You will be prompted to choose the field name of the grouping variable, which in this case is GROUP. Choose GROUP. Next, you will be asked for the data field. Choose WEIGHT, the response variable.
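For readers who want to see what a one-way ANOVA computes from records like these, the following is a minimal sketch using the HOGFEED values listed above. It is not KWIKSTAT's own code; the function name `one_way_anova` and the tuple layout are assumptions made for illustration, and only the F statistic and its degrees of freedom are computed (no p-value).

```python
# Illustrative sketch only: the textbook one-way ANOVA F statistic.
# Names such as one_way_anova are hypothetical, not part of KWIKSTAT.
from collections import defaultdict

# (GROUP, WEIGHT) pairs, one per database record, as in the HOGFEED listing.
records = [
    (1, 60.8), (1, 67.0), (1, 54.6), (1, 61.7),
    (2, 78.7), (2, 77.7), (2, 76.3), (2, 79.8),
    (3, 92.6), (3, 84.1), (3, 90.5),
    (4, 86.9), (4, 82.2), (4, 83.7), (4, 90.3),
]

def one_way_anova(rows):
    """Return (group means, df between, df within, F statistic)."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    n_total = sum(len(v) for v in groups.values())
    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    # Variation of the group means about the grand mean.
    ss_between = sum(len(v) * (means[g] - grand_mean) ** 2
                     for g, v in groups.items())
    # Variation of the observations about their own group mean.
    ss_within = sum((x - means[g]) ** 2
                    for g, v in groups.items() for x in v)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return means, df_between, df_within, f_stat

means, df_between, df_within, f_stat = one_way_anova(records)
print(means, df_between, df_within, round(f_stat, 2))
```

A large F relative to its degrees of freedom corresponds to a small p-value; sorting the groups by their means gives the order in which a multiple comparison display would list them.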
KWIKSTAT will now perform the calculations and display the results on the screen, as illustrated in Figure 4.7.

The results of this test are summarized in the p-value. In this case, the small p-value (0.000) means that there is a significant difference between groups. The p-value (rounded off) is less than 0.0005. That is, the difference between these observed averages is so far from zero that the chance of getting differences farther from zero is less than five in 10,000 if the true means are equal (their difference is zero). This is taken as evidence of a "real" difference between feeds, a difference not due to chance. A p-value of this magnitude is often reported as p < .01 in research literature.

The ANOVA tells you only that there is a difference among the feeds. In order to find out which groups are significantly different from which others, press M to choose (M)ultiple comparison. The Newman-Keuls multiple comparison test will describe which of the means are significantly different from which others (at the 0.05 significance level).

Figure 4.8 displays a graphical representation of the Newman-Keuls multiple comparisons test. The group numbers are given in increasing order of the value of their group means. That is, Group 1 has the smallest mean, Group 3 the largest. At the 0.05 significance level, the means of any two groups underscored by the same line are not significantly different. This display tells you that (at the 0.05 significance level):

1) The mean for group 1 (feed 1) is statistically significantly less than the means for all other groups.
2) The mean for group 2 (feed 2) is significantly greater than the mean for group 1, and significantly less than the means of groups 4 and 3.
3) The means for groups 4 and 3 are not significantly different from each other, but they are both significantly greater than the means of groups 1 and 2.

You can conclude that feeds 3 and 4 are better than feeds 1 and 2, but there is not enough evidence to say that either feed 3 or 4 is the best overall.

Box plots are also available to graphically illustrate the differences between the groups. Type G (for graphical comparison) and press Enter to produce the plots.

PARAMETRIC REPEATED MEASURES (PAIRED) ANALYSIS

Repeated measures are observations taken on the same or related subjects over time or in differing circumstances. Examples would be weight loss, or reaction to a drug across time. Repeated measures may also be taken on matched subjects.

In this module, as in the independent groups module, a t-test is performed when there are two groups (two repeated measures), and an analysis of variance is performed by KWIKSTAT if there are three to ten groups. The ANOVA determines if there is a difference in the means across groups or repeated measures. A multiple comparison procedure further identifies where the differences lie.

In a database for paired or repeated measures data, each record represents one subject (e.g., person, animal). There must be one field for each repeated measure (each treatment group). For paired data, there are two groups, hence two fields. Thus, in each record, there is a field in which to enter data from each observation (treatment) on that subject.

This repeated measures (paired) analysis requires that all values be available for each subject; any subject with missing values is eliminated from the analysis. That is, a data record must have a value for each field, or it will be eliminated.
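The record layout and the elimination of incomplete records described above can be sketched as follows. This is only an illustration under stated assumptions: the data values are made up, the field names VAR1 and VAR2 are borrowed from the pre-defined paired structure, the function name `paired_t` is hypothetical, and the difference is taken as first field minus second field.

```python
# Illustrative sketch only: one record per subject, one field per measure.
# The values below are hypothetical, not taken from the manual.
import math

records = [
    {"VAR1": 10.0, "VAR2": 8.0},
    {"VAR1": 12.0, "VAR2": 11.0},
    {"VAR1": 9.0, "VAR2": None},   # missing second measure: record is dropped
    {"VAR1": 14.0, "VAR2": 10.0},
    {"VAR1": 11.0, "VAR2": 9.0},
]

def paired_t(rows, first, second):
    """Drop incomplete records, then compute the paired t statistic.

    Assumes the difference is first minus second; returns
    (number of complete records, mean difference, t statistic).
    """
    complete = [r for r in rows
                if r[first] is not None and r[second] is not None]
    diffs = [r[first] - r[second] for r in complete]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / math.sqrt(variance / n)
    return n, mean_diff, t_stat

n_used, mean_diff, t_stat = paired_t(records, "VAR1", "VAR2")
print(n_used, mean_diff, round(t_stat, 3))
```

The t statistic is referred to a t distribution with one fewer degrees of freedom than the number of complete records.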
The hypotheses being tested with a paired t-test or a repeated measures ANOVA are:

Ho: There is no difference among means of the groups (repeated measures).
Ha: There is a difference among means of the groups.

For comparing matched or paired data (not independent) from two groups, a paired t-test is used.

EXAMPLE 4.9: PAIRED T-TEST

The data in this example are before and after weights for eight persons on a diet. Notice that in this case, both data values are taken from the SAME entity (person).

Data for paired t-test

Person   Before   After
     1      162     168
     2      170     136
     3      184     147
     4      164     159
     5      172     143
     6      176     161
     7      159     143
     8      170     145

CREATING THE DATABASE

The database will include two fields (BEFORE and AFTER) and eight records, one for each person. Since the observations are paired, not independent, the database reflects this by having each record contain a pair of observations. Each record, that is, each person, is independent of the other seven persons, but within a record, the before and after observations are not independent of each other. To create this database:

Create a database called DIET containing the data listed above. Use the pre-defined database structure called FOR PAIRED T-TEST OR McNEMAR's TEST. This will create a database with the fields VAR1 and VAR2. VAR1 will be used for Before and VAR2 will be used for After. Of course, you can choose to create a custom database and enter a structure containing the fields named BEFORE and AFTER.

Enter the data into the database. The data you will enter in the first record are 162 (press Enter) and 168 (press Enter). Enter the data for the eight records. When you exit the entry procedure, KWIKSTAT will return to the Data main menu.

PERFORMING THE ANALYSIS

Select the Analyze option from the main KWIKSTAT menu.
Next, choose the t-tests and Analysis of Variance (ANOVA) option. The t-tests and Analysis of Variance Choose Analysis Option menu will appear. Then choose Compare repeated or paired data (t-test, ANOVA). You will be prompted to choose the fields which you wish to compare. Choose BEFORE and AFTER. KWIKSTAT will now perform the calculations and display the results on the screen.

The means and standard deviations for each group are displayed, but more importantly, the mean difference between BEFORE and AFTER measurements is given. The statistical procedures are performed on this average difference. A 95% confidence interval for the mean difference is given, as well as a calculated t-statistic and a p-value. These results are interpreted like those of a single sample t-test with null hypothesis: mean = 0, and alternative hypothesis: mean <> 0.

The calculated t-statistic is 3.71. The test is performed with 7 degrees of freedom, and the p-value associated with the test is 0.008. A small p-value such as this is usually interpreted to indicate rejection of the null hypothesis and leads to the conclusion that the average difference in BEFORE and AFTER weights is not zero, i.e., there is evidence of a significant (at the 0.05 level) change of weight in these eight subjects on average. Since KWIKSTAT uses the difference "first minus second variable" (i.e., "before minus after") to compute the t-statistic, and since the mean difference is positive, before weights are on average greater than after weights, which implies that the change is a loss of weight on average.

INDEPENDENT GROUP TESTS FROM SUMMARY DATA

This option allows you to perform a one-way ANOVA or a t-test if you have only the means, standard deviations and group sizes of two to ten groups.
Since the data are summaries, no box plots can be given.

USING NON-PARAMETRIC COMPARATIVE PROCEDURES

Non-parametric procedures are appropriate when the assumption of normality cannot be made for a small data set or when a large data set is known to be from a non-normal population. Non-parametric procedures are generally based on ranks rather than actual data values, so these procedures can also be useful when actual data values are not known, but the order or ranks of the data values are known.

NON-PARAMETRIC INDEPENDENT GROUP ANALYSIS - MANN-WHITNEY AND KRUSKAL-WALLIS TESTS

This option is appropriate if you are comparing two or more independent groups, but you cannot make the assumption that the observed data follow a normal distribution or that the variances are equal. It is also useful if you do not have exact data values for the observations but you do have order statistics, that is, you don't know the exact response values but you know which is largest, next largest, and so forth, to smallest. The samples must be randomly and independently taken from populations that differ only with respect to location, and the variable of interest should be continuous. See Zar, 1984.

In the Non-Parametric Comparison Tests Module, KWIKSTAT uses the Mann-Whitney procedure if two independent groups are being compared, and the Kruskal-Wallis procedure if three or more groups are being compared. The hypotheses being tested are:

Ho: There is no difference in the medians of the groups.
Ha: There is a difference in the medians of the groups.
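To make the idea of a rank-based comparison concrete, here is a minimal sketch of the Mann-Whitney U statistic for two independent groups. The data values and the function names `average_ranks` and `mann_whitney_u` are hypothetical, and the sketch stops at the U statistics; it does not reproduce KWIKSTAT's output or compute a p-value.

```python
# Illustrative sketch only: Mann-Whitney U from ranks of the pooled data.
# Data and function names are hypothetical.

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, U for group_b)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

group_a = [12, 15, 11, 18]
group_b = [14, 20, 22, 19, 25]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

Only the ordering of the pooled observations enters the calculation, which is why the procedure can be used when ranks are known but exact values are not.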
The Mann-Whitney and Kruskal-Wallis non-parametric procedures differ from the independent groups analysis described in the previous section in that the ranks, or order, of the data are used for the analysis rather than the data values themselves.

FRIEDMAN'S TEST

When repeated observations are taken from the same subject, and there is interest in comparing the observations for each repeated measure (e.g., each type of treatment), then a repeated measures analysis may be appropriate. If you cannot make the assumption that the data are normal, then a non-parametric analysis is appropriate. One method of performing such a test is Friedman's analysis.

COCHRAN'S Q TEST

Cochran's Q test is a non-parametric procedure appropriate for use with dichotomous data when the experiment involves repeated measures on blocks. Often the blocks are subjects (people or animals). The response of the subjects to the treatments is dichotomous if it is taken as one of only two possible outcomes, often labeled "success" and "failure", rather than as a measurement.

USING REGRESSION & CORRELATION PROCEDURES

To examine the linear relationship between variables, correlation and linear regression are used. Simple linear regression is used for predicting a value of a dependent variable using an independent variable. Multiple regression is used for predicting the value of a dependent variable using one or more independent variables. Correlation is used to measure the strength of association between two variables.

For example, you may be interested in relating advertising to orders received. The question you are asking is, "Is there a relationship between the amount of money spent on advertising and the amount of orders received?"

It is also possible to compare more than two variables at a time using multiple regression. For example, you may be interested in how the combination of radio advertising costs, direct mail costs and commissions relates to the number of orders received.
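As a rough illustration of the two ideas introduced above, the sketch below computes a least-squares line and Pearson's correlation coefficient for one independent and one dependent variable. The advertising and orders figures are invented for the example, and the function name `fit_line` is an assumption; this is the standard textbook calculation, not KWIKSTAT's own routine.

```python
# Illustrative sketch only: simple linear regression and Pearson correlation.
# The data are hypothetical (advertising spend vs. orders received).
import math

def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)   # Pearson's correlation coefficient
    return slope, intercept, r

advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
orders = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept, r = fit_line(advertising, orders)
print(round(slope, 3), round(intercept, 3), round(r, 4))

# Predict only within the range of the original data (here, 1 to 5).
predicted = intercept + slope * 3.5
```

The slope and intercept answer the prediction question, while r summarizes how strongly the two variables are linearly associated.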
SIMPLE LINEAR REGRESSION ANALYSIS

When you choose the Simple Linear Regression option, KWIKSTAT will prompt you to choose the "independent" and "dependent" variables to be used in the analysis. The "independent" variable is generally that variable that you can choose, regulate or specify (e.g., amount of money spent on advertising) and the "dependent" variable is the one you observe and would possibly like to predict.

After the two variables are chosen, KWIKSTAT will present the results of its calculations. The regression equation will be displayed along with other results. This equation is the "least squares" line fitted to the data. If the fit is appropriate, the equation may be used to predict a new value of the dependent variable given the value of the independent variable, within the range of the original data.

MULTIPLE REGRESSION ANALYSIS

Multiple regression is an extension of simple linear regression into several dimensions (several independent variables). In the multiple regression procedure, you must enter a list of the independent variables and a single dependent variable on which you wish to perform the regression analysis. In KWIKSTAT you may use up to 10 independent variables in this option.

Multiple regression can be complicated. Refer to a good text on the subject before drawing any conclusions about your results. KWIKSTAT calculates and displays several results, including the coefficients and intercept of the regression "line". A significance test is performed to determine the significance of the contribution of the different variables or factors to the model (mathematical representation). Also displayed is R-square (R2), as well as adjusted R-square.
R-square varies from 0.0 to 1.0, with 0.0 meaning no relationship (model is not good) and 1.0 meaning the regression equation perfectly describes the sample data.

An analysis of variance is performed to determine the overall significance of the model. If the ANOVA reveals a significant relationship (that is, if the p-value is small), the model may be a good representation of the sample data.

A plot of residuals from the fit is available. You may plot the fit against any of the variables. Look for patterns in the residuals. Patterns other than a horizontal band about zero suggest that the assumptions necessary for regression analysis may be violated. If you are unfamiliar with multiple regression, the Neter and Wasserman book contains an excellent treatment.

EXAMPLE 4.20: MULTIPLE REGRESSION ANALYSIS (LONGLEY DATA)

Longley introduced a data set which has often been used in comparing multiple linear regression procedures in the literature. The variables refer to economic factors.

This example uses the LONGLEY database on the KWIKSTAT disk. The LONGLEY database consists of 7 fields: DEFLATOR, GNP, UNEMP, ARMED, POP, TIME, and TOTAL. The first six of these will be used as independent variables and the seventh, TOTAL, is the dependent variable (the one to be predicted). Figure 4.15 displays the LONGLEY database. You can get this display by using the List (display) the contents of a database option on the Data main menu.

PERFORMING THE ANALYSIS

Open the LONGLEY database. From the Analyze menu in the main KWIKSTAT module, select the Regression & Correlation module. From the Correlation and Regression menu select Multiple Linear Regression analysis.
You will be prompted to enter the INDEPENDENT VARIABLE(S), which in this case are DEFLATOR, GNP, UNEMP, ARMED, POP, TIME. Enter any combination of 1,2,3,4,5,6 to choose the variable(s) you wish to analyze against TOTAL. One way to approach a multiple regression problem is to first include all of the independent variables. After initial analysis (see below) you may decide to eliminate those independent variables found not to be significant.

After entering the independent variables, you will be asked for the DEPENDENT VARIABLE. Enter 7, which chooses TOTAL. KWIKSTAT will now perform the calculations and display the results on the screen, as illustrated in Figure 4.16.

EXAMPLE 4.22: CORRELATION MATRIX (LONGLEY DATA)

Select the Correlation matrix option from the Regression and Correlation menu. You will be prompted to choose variables from the list of fields that appears. In this case, there are seven fields, and you can choose any combination of them. If you want correlation coefficients of all pairs of the seven variables, type 1,2,3,4,5,6,7 and press Enter.

KWIKSTAT will perform the calculations and display the 7 by 7 array shown in Figure 4.17. Only half of the array is displayed since the other half is a mirror image. The diagonal entries are also omitted since they are all one; a variable is always perfectly correlated with itself.

Each entry in the array consists of two numbers (three numbers if the information is viewed or printed to a printer). The first (upper) is the Pearson's correlation coefficient for the two (row and column) variables of that entry. The second (middle) number, in parentheses, is the p-value of the t-test for Ho: rho = 0 vs. Ha: rho <> 0.
If you view the results, the third (bottom) number, in brackets, is the sample size, or number of paired observations used in the calculations.

Both the correlation coefficient and the p-value are interpreted as they are for any correlation of two variables (see Example 4.21 above). In this array, for example, POP and TIME are highly correlated (r=0.994, p=0.00) but POP and ARMED are not (r=0.364, p=0.17).

EXAMPLE 4.23: GRAPHICAL CORRELATION MATRIX (LONGLEY DATA)

From the Simple and Multiple Regression Choose Analysis Option menu, select Graphical Correlation matrix. You will be prompted to choose variables from the list of fields that appears. In this case, there are seven fields, and you can choose any combination of them. If you want scatterplots of all pairs of the seven variables, type 1,2,3,4,5,6,7 and press Enter.

KWIKSTAT will perform the calculations and display the 7 by 7 array of scatterplots shown in Figure 4.18. These scatterplots are a visual way of examining the relationships between pairs of variables. They allow you to determine if a relationship exists between the variables, and to see if that relationship is linear. The more highly correlated two variables are, the more tightly clustered about a straight line are the points on the scatterplot.

USING FREQUENCY AND CROSSTABULATION PROCEDURES

The Crosstabulations, Frequencies, Chi Square module performs analyses on categorical data, that is, data observed in categories, rather than measurement data. Previous examples using measurement data include weights of hogs, weights of people, heights of plants, numbers of handguns and homicides, and dollar amounts. If, rather than taking a measurement, a data observation involves identifying which of a set of categories the observation falls into, you are working with categorical data. For example, you may identify a person by sex, eye color or hair color.
You may identify families, individuals or institutions by geographical region or socioeconomic status identified by levels 1,2,3,4,5. You may identify employees by level of job satisfaction, where there are three or four levels to choose from.

Generally, categorical data are entered into a database by using one record for each person or entity on which the observation is made and one field for each characteristic which is divided into categories. For example, to categorize ten people by sex, hair color and eye color, you would need ten records (one per person) and three fields (e.g., SEX, HAIR, EYE).

Some of the procedures in this module give you the choice of simply entering totals for each category rather than creating a database and entering the results of each observation. This can save time if totals are known and only totals are needed to perform a test or calculation or to produce a graph.

PERFORMING A FREQUENCIES ANALYSIS

In the Frequencies option, KWIKSTAT "counts" the occurrence of each data value for a single variable or field and displays that information in a table. You can also create a bar chart, pictograph and/or pie chart of this information using this option.

EXAMPLE 4.24: FREQUENCY TABLE, PICTOGRAPH, BAR AND PIE CHARTS

This example uses the EXAMPLE database file on the KWIKSTAT disk. One of the fields (variables) in this database is STATUS, referring to socioeconomic status. Suppose you want to know how the total data set is divided up into the five levels of STATUS. You also want to produce a visual display of this information. Open the EXAMPLE database.

PERFORMING THE ANALYSIS

From the Frequencies and Crosstabulations Choose Analysis Option menu, select Frequencies, Pictograph, Pie Chart.
You will be prompted to enter one field (variable) to use. Since you want to do a frequency table on STATUS, enter 7. KWIKSTAT will count the data in each of the five categories of STATUS and display the results as a frequency table.

You are then prompted to press Enter, which takes you to the Frequencies Analysis menu. From this menu you may choose to go back and do another analysis, or create charts.

PERFORMING A GOODNESS OF FIT ANALYSIS

A goodness-of-fit test of a single population is a test to determine if the distribution of observed frequencies in the sample data closely matches the expected number of occurrences under a hypothetical distribution of the population. The data observations must be independent and each data value can be counted in one and only one category. It is also assumed that the number of observations is fixed. The hypotheses being tested are:

Ho: The population distribution being sampled follows the hypothesized distribution.
Ha: The population does not follow the hypothesized distribution.

PERFORMING A CROSSTABULATION ANALYSIS (CHI-SQUARE)

Crosstabulations can be used to perform a chi-square test for independence or a chi-square test for homogeneity. A two-way table is constructed that displays the number of counts for each category. It must be possible to assume that the data observations are independent and that each data value can be counted in one and only one category. It is also assumed that the number of observations is fixed.

KWIKSTAT allows you to enter data for a two-way table from the keyboard or from a database. When you choose to enter the two-way table from the keyboard, KWIKSTAT will ask you the size of the table (number of rows and columns).
A blank table will be presented on the screen, and you will then be prompted to enter a number in each cell of the table.

If you choose to enter the information from a database, KWIKSTAT will prompt you to enter a list of tables to be calculated. For example, '2 BY 3' specifies a tabulation of field 2 by field 3. The specification '1 BY 2,3' specifies the tables 1 BY 2 and 1 BY 3. You may list up to ten field numbers on both sides of the BY. KWIKSTAT will read the information from the database, and construct the table or tables. For instance, in the EXAMPLE database, if you choose to tabulate the variables GROUP and STATUS, KWIKSTAT will form the table on the screen as illustrated in Figure 4.23. (Note that the first variable entered is the row variable.)

When requesting multiple tables in the BY specification, you will be given the option to pause after each table is displayed, or to continue non-stop and print all of the requested tables to the printer or to a file.

For a test for independence, a contingency table looks at two categorical variables from a single sample of one population and tests whether the two variables are related in some way (e.g., are sex and hair color related?). The hypotheses being tested are:

Ho: The variables are independent of each other. (There is no association between them.)
Ha: The variables are not independent of each other.

KWIKSTAT reports both the chi-square statistic and the p-value. If the expected value (Eij) in one or more cells is less than 5, the chi-square test may not be valid. A warning to this effect appears on the screen if appropriate. In the case of a 2 by 2 table, Fisher's Exact Test and the chi-square with Yates' correction are also performed and the results displayed.
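The chi-square calculation and the expected-value check described above can be sketched as follows. This is an illustration under stated assumptions rather than KWIKSTAT's code: the function name `chi_square_two_way` is hypothetical, the counts are the beetle and bug totals given in Example 4.26, and p-values and Fisher's Exact Test are left out.

```python
# Illustrative sketch only: chi-square statistic for a two-way table,
# with the optional Yates continuity correction used for 2 by 2 tables.

def chi_square_two_way(table, yates=False):
    """Return (chi-square statistic, table of expected counts Eij)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Eij = (row total * column total) / grand total
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = 0.0
    for obs_row, exp_row in zip(table, expected):
        for observed, exp in zip(obs_row, exp_row):
            diff = abs(observed - exp)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / exp
    return stat, expected

# Rows: upper leaf, lower leaf. Columns: beetles, bugs.
counts = [[12, 7],
          [2, 8]]

chi2, expected = chi_square_two_way(counts)
chi2_yates, _ = chi_square_two_way(counts, yates=True)
small_cell = any(e < 5 for row in expected for e in row)

print(round(chi2, 2), round(chi2_yates, 2), small_cell)
```

When `small_cell` is true, at least one expected count is below 5, which is the situation in which the chi-square result may not be valid.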
EXAMPLE 4.26: CROSSTABULATION ANALYSIS (2 BY 2) TEST FOR INDEPENDENCE

Data for this example are observations of the number of beetles and bugs on the upper and lower sides of leaves (Zar, 1974, page 292).

2 by 2 Contingency Table Data

              Beetles   Bugs
              ---------------
Upper Leaf       12       7
Lower Leaf        2       8

Since you are given only the totals for each of the four categories, and not the individual data for each leaf, there is no need to create a database. Rather, you can just enter these totals from the keyboard.

The calculated chi-square statistic in this case is 4.89 with a p-value of 0.028. The chi-square with Yates' correction is 3.31 with a p-value of 0.069 and the Fisher Exact Test (two tail) has a p-value of 0.050. Because one of the cells produces an expected value less than 5, KWIKSTAT gives a warning that the chi-square analysis for this data may not be valid. Given this warning, it is best to rely on Fisher's Exact Test for making a decision.

A decision can be made using the p-value of the test. A low p-value (less than or equal to the chosen significance level) is usually taken to indicate rejection of the null hypothesis. At a 0.05 significance level, the Fisher's Exact Test p-value of 0.050 indicates (on the borderline) that there is enough evidence to reject the null hypothesis of independence of the two variables and to conclude that leaf side and type of insect are not independent. In this case it appears that beetles prefer the upper sides of leaves and bugs are about split in their preference. In the case of the Yates results, this decision is marginal.

After viewing the crosstabulation results, press Enter and a Crosstabulations Analysis menu will appear.
This menu gives you the options of doing another analysis, printing what you have done, or producing a 3-dimensional bar chart. Select by highlighting the desired option and pressing Enter.

DRAWING A 3-D BAR CHART

KWIKSTAT allows you to draw a 3-dimensional bar chart of data for a contingency table (crosstabulation), and then to focus in on a part of it if desired. Data for the 3-dimensional bar chart must be entered first, either from the keyboard or a database, by using the Crosstabulations, Chi-Square option of the Frequencies and Crosstabulations Choose Analysis Option menu. To get to this menu from the Data main menu, select Analyze at the top of the screen, and then select Crosstabulations, Frequencies, Chi Square.

MCNEMAR'S TEST

McNemar's test is appropriate for use with paired, dichotomous data. This test is sometimes called a test for related samples or a test for the significance of changes. It is used when the response is one of only two possible outcomes. McNemar's test is the 2 by 2 version of Cochran's Q test described earlier. The test assumes that any pair of observations is independent of any other pair of observations, although clearly the observations within a pair are not independent of each other.

USING LIFE TABLES AND SURVIVAL ANALYSIS PROCEDURES

As the name indicates, this module performs life tables and survival analysis procedures. The data must be in the following form:

1) a TIME variable which contains a time (e.g., minutes, days, years, etc.) in which the subject or component has been observed to be alive (not failed).
2) a CENSOR variable which must take on the values 0 or 1, where 1 means the subject has died (failed), and 0 means the subject was still alive (not failed) at the last available time period.
3) optionally, a GROUPING variable which may have up to ten values (numeric or character), i.e., the data may be in groups.
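A small sketch may help show how TIME and CENSOR values of this form turn into life table columns. It uses the standard actuarial method, which is an assumption here since the manual does not spell out KWIKSTAT's formulas; the data, the interval width, and the function name `life_table` are all hypothetical, and every record with CENSOR = 0 is counted as withdrawn.

```python
# Illustrative sketch only: an actuarial life table for a single group.
# Each record is (TIME, CENSOR), with CENSOR 1 = failed, 0 = still alive.

def life_table(records, width):
    """Return one dict per time interval of the given width."""
    max_time = max(time for time, _ in records)
    n_intervals = int(max_time // width) + 1
    rows = []
    entered = len(records)
    cumulative = 1.0
    for i in range(n_intervals):
        start, end = i * width, (i + 1) * width
        dead = sum(1 for t, c in records if start <= t < end and c == 1)
        withdrawn = sum(1 for t, c in records if start <= t < end and c == 0)
        # Withdrawn subjects are assumed exposed for half the interval.
        exposed = entered - withdrawn / 2
        prop_dead = dead / exposed if exposed > 0 else 0.0
        cumulative *= 1 - prop_dead
        rows.append({
            "start": start, "entered": entered, "withdrawn": withdrawn,
            "dead": dead, "exposed": exposed, "prop_dead": prop_dead,
            "prop_surviving": 1 - prop_dead, "cum_surviving": cumulative,
        })
        entered -= dead + withdrawn
    return rows

records = [(5, 1), (8, 0), (12, 1), (15, 1), (21, 0), (23, 1), (27, 1), (34, 0)]
table = life_table(records, width=10)
for row in table:
    print(row["start"], row["entered"], row["dead"], row["withdrawn"],
          round(row["cum_surviving"], 4))
```

Plotting `cum_surviving` against the interval start times gives the kind of survival curve the module draws for each group.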
Once the data are entered into the program, a life table for each group is produced which includes, for each time interval, the number entered, withdrawn, lost, dead, exposed, the proportion dead, proportion surviving, cumulative proportion surviving, hazard and density. A plot is given for the cumulative proportion surviving in the group(s) against time. If more than one group is entered, a Mantel-Haenszel test is performed to test the hypothesis of equal survival patterns for the groups. A small version of the survival plot will appear on the screen, and if you choose to print a report of the session the report will include a larger version of the plot along with other information from the analysis.

PERFORMING A LIFE TABLE ANALYSIS

Survival analysis is used to summarize information in life tables, to examine survival trends over time, and to compare survival times between groups.

EXAMPLE 4.31A: LIFE TABLE ANALYSIS

The data for this example are in the LIFE database on the KWIKSTAT disk. These data are from Prentice (1973). Open the database named LIFE. The LIFE database consists of 3 fields: SURVIVAL, CENSOR, and GROUP. Figure 4.30 displays a portion of the LIFE database. You can get this display by using the "List (display) the contents of a database" option on the Data main menu.

The first column is the SURVIVAL field with entries of length of life, or length of survival. The second column is the CENSOR field, an indicator of whether the subject has failed (died) or not at the last observed time period. 1 means failed, 0 means not failed (still alive). The third column contains a grouping variable. In this case it is either 1 or 2. Group 1 may represent one treatment, while group 2 represents another kind of treatment.
The objective is to compute survival curves to see whether the treatments produce different survival distributions.

APPENDIX

INTERPRETING ERROR CODES

If the program encounters a problem it does not know how to resolve, it will usually display an error message. This message will contain an error code and a reference code. Many times, you can correct the error situation by understanding what caused it. For example, if you were to get an error number 27, you would know that it was caused by your printer sending an "Out of Paper" message to the program. If you are unable to resolve the problem, write down the steps taken before the error occurred, and send them to TexaSoft on the Problem Report Form. We will try to resolve the problem as quickly as possible.

Error Codes:

Error Number 5 = Illegal function call
Error Number 6 = Overflow
Error Number 7 = Out of Memory
Error Number 9 = Subscript out of range
Error Number 11 = Division by zero
Error Number 14 = Out of String Space
Error Number 24 = Device Timeout
Error Number 25 = Device fault
Error Number 27 = Out of Paper
Error Number 50 = FIELD overflow
Error Number 51 = Internal Error
Error Number 52 = Bad filename or number
Error Number 53 = File not found
Error Number 54 = Bad file mode
Error Number 55 = File already open
Error Number 57 = Device I/O error
Error Number 58 = File already exists
Error Number 61 = Disk full
Error Number 62 = Input past end of file
Error Number 63 = Bad record number
Error Number 64 = Bad filename
Error Number 67 = Too many files
Error Number 68 = Device unavailable
Error Number 70 = Permission denied
Error Number 71 = Disk not ready
Error Number 72 = Disk media error
Error Number 74 = Rename across disks
Error Number 75 = Path/File access error
Error Number 76 = Path not found
Error Number 81 = Invalid filename

Problem Report form: KWIKSTAT

Please explain in detail the problem that occurred. If possible, send a printout of the results or a Print Screen.

KWIKSTAT VERSION YOU ARE USING:________________________
KWIKSTAT MODULE where problem occurred:____________________
YOUR COMPUTER: BRAND/Model_____________________________
MONITOR TYPE:________ AMOUNT OF MEMORY:_______________
VERSION OF DOS YOU ARE USING:____________________________
MEMORY RESIDENT PROGRAMS YOU USE:____________________
PROBLEM:

Mail to: TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75104. Or fax to 214-291-3400, or send E-Mail to Compuserve 70721,3145.

USER'S BALLOT

Please indicate your preference for improvements in KWIKSTAT.

On a scale of 0 to 10:
0 = Very Low priority for this change
10 = Very High priority for this change

Vote  Proposed item of change
----  -----------------------------------------------------
____  More "BY GROUP" capabilities
____  Ability to sort database
____  Add more ANOVA types
____  Add more Non-parametric statistical tests
____  Add General Linear Model
____  Make Report more flexible
____  Add Quality Control Module
____  Speed up program functions
____  Add more graphics, what kind?
____  Improve graphic quality
____  Add cluster analysis
____  Add discriminant analysis
____  Automate analysis from a command file
____  _____________________________________________
____  _____________________________________________

Other Comments:

Mail to: TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75104.
Or fax to 214-291-3400, or send E-Mail to Compuserve 70721,3145.

S H A R E W A R E
_________________

TRY IT BEFORE YOU BUY IT

The purpose of shareware products is to allow you to try software products before you buy them. KWIKSTAT is not a public domain program. Persons who use KWIKSTAT on a regular basis should purchase a copy. You receive several benefits from becoming an official registered user:

1. You help to keep the product growing to meet your needs.

2. You receive the very latest version, with a printed, bound, and expanded manual.

3. You receive periodic newsletters announcing new releases, and pointing out important information on any bugs and fixes.

4. You are able to purchase updates to new versions for a minimal cost.

To register, print the file named KSORDER.TXT on the disk. Thanks.

Association of Shareware Professionals (tm) MEMBER

This program is produced by a member of the Association of Shareware Professionals (ASP). ASP wants to make sure that the shareware principle works for you. If you are unable to resolve a shareware-related problem with an ASP member by contacting the member directly, ASP may be able to help. The ASP Ombudsman can help you resolve a dispute or problem with an ASP member, but does not provide technical support for members' products. Please write to the ASP Ombudsman at 545 Grover Road, Muskegon, MI 49442 or send a CompuServe message via CompuServe Mail to ASP Ombudsman 70007,3536.